Abstract

The problem of Text Indexing is a fundamental algorithmic problem in which one wishes to
preprocess a text in order to quickly locate pattern queries within the text. In the ever-evolving
world of dynamic and on-line data, there is also a need for developing solutions to index texts
which arrive on-line, i.e., a character at a time, and still be able to quickly locate said patterns.
In this paper, a new solution for on-line indexing is presented by providing an on-line suffix tree
construction in O(log log n + log log |Σ|) worst-case expected time per character, where n is the
size of the string, and Σ is the alphabet. This improves upon all previously known on-line suffix
tree constructions for general alphabets, at the cost of having the run time in expectation.

The main idea is to reduce the problem of constructing a suffix tree on-line to an interesting
variant of the order maintenance problem, which may be of independent interest. In the famous
order maintenance problem, one wishes to maintain a dynamic list L of size n under insertions,
deletions, and order queries. In an order query, one is given two nodes from L and must
determine which node precedes the other in L. In an extension to this problem, named the
Predecessor search on Dynamic Subsets of an Ordered Dynamic List problem (POLP for short),
it is also necessary to maintain dynamic subsets S_1, ..., S_k ⊆ L, such that given some u ∈ L
it will be possible to quickly locate the predecessor of u in S_i, for any integer 1 ≤ i ≤ k.
This paper provides an efficient data structure capable of locating the predecessor of u in S_i in
O(log log n) worst-case time and answering order queries on L in O(1) worst-case time, while
allowing updates to L in O(1) worst-case expected time and updates to the subsets in O(log log n)
worst-case expected time. This improves over a previous data structure which may be implicitly
obtained from Dietz [Die89], in which the updates to the sets and L are done in O(log log n)
amortized expected time. In addition, the bounds shown here match the currently best known
bounds for predecessor search in the RAM model.

Furthermore, this paper improves or simplifies bounds for several additional applications, in- 
cluding fully-persistent arrays, the monotonic list labeling problem, and the Order-Maintenance 
Problem. 



1 Introduction 

Text Indexing is one of the most important paradigms in text searching. The idea is to preprocess
a text T of size n and construct a mechanism that will later provide answers to queries of the
form "does a pattern P of size m occur in the text?" in time proportional to m rather than n.
This paradigm can be seen in many applications of the computing world, including web searching,
computational biology applications, stock market predictions, and indexing astronomical data. In
the stock market scenario, for example, one also needs to consider the very large size of the alphabet
|Σ| from which the text is drawn. Such a consideration also needs to be taken when dealing with
astronomical data, such as indexing the sky [SKT+00]. The suffix tree and suffix array have proven
to be invaluable data structures for indexing.

One of the problems that has occupied the algorithmic community is that of constructing an 
on-line or real-time indexing algorithm. An algorithm is on-line if it accomplishes its task for the
i-th input without needing the (i+1)-st input. It is real-time if, in addition, the time it operates
between inputs is a constant.



Some of the static suffix tree constructions work in the on-line model [Ukk95, Wei73], in which
one maintains a suffix tree for a text that arrives character by character, and at any given time
one might receive a pattern query. For simplicity's sake, assume that the text arrives from the end
towards the beginning. This is because it can be shown that a single character added at the end of
the text can impose a linear number of changes to the suffix tree. Of course, if the text arrives from
beginning to end, one can view the text in reversed form, and then a queried pattern is reversed
as well in order to obtain the correct results. The currently best known results for the on-line
suffix tree construction for general alphabets are O(log |Σ|) amortized time per character by
Weiner in [Wei73], and O(log n) worst-case time per character by Amir, Kopelowitz, Lewenstein, and
Lewenstein in [AKLL05], where n is the size of the text and Σ is the alphabet. As far as constant
sized alphabets are concerned, Breslauer and Italiano in [BI11] recently obtained an algorithm which
costs O(log log n + |Σ|) = O(log log n) worst-case time per character. For constant-size alphabets
there is another indexing structure by Amir and Nor [AN08] which does not enjoy the various
advantages of the suffix tree.

The results here beat the cost per character of all of the above algorithms for general alphabets.
However, the time bounds are in expectation (but not amortized). It is proven here that a suffix
tree can be updated on-line in O(log log n + log log |Σ|) worst-case expected time per character.
This is under the natural assumption that each character in Σ fits into a constant number of words
in memory. The idea behind the approach used here, as shown in Section 2, is to reduce the on-line
suffix tree construction to a new data structure problem called the Predecessor Search on Dynamic
Subsets of an Ordered Dynamic List Problem (POLP). This problem is defined and discussed in
detail next.

1.1 Predecessor Search on Dynamic Subsets of an Ordered Dynamic List 

The POLP, in a simplistic view, is a combination of two well studied problems: the Order- 
Maintenance Problem (OMP) and the Predecessor problem. 

Order-Maintenance: In the OMP the goal is to maintain an ordered list L of size n under the 
following operations: (1) Insert(u, v) - Insert node u after node v in L. (2) Delete(u) - Delete node 
u from L. (3) Order(u, v) - Determine whether u precedes v in L.



The first data structure to solve the OMP was introduced by Dietz in [Die82], which was later
improved together with Sleator in [DS87], where in the first part of their paper they introduced
a solution to a different problem known as the Monotonic List Labeling Problem (MLLP). In the
MLLP the goal is the same as in the OMP with the additional constraint that each node in L is
given a unique tag from a range bounded polynomially in n. Then, order queries are performed
by comparing tags in constant time. Dietz and Sleator in [DS87] showed how the MLLP can
be solved such that each insertion costs O(log n) amortized time. They then used this solution
to provide a solution for the OMP where each insertion costs O(1) amortized time. About 15
years later, Bender, Cole, Demaine, Farach-Colton, and Zito in [BCD+02] simplified both of these
solutions, obtaining the same time bounds. They further claimed that their amortized solution can
be made worst-case, and deferred the description to the full paper, which unfortunately has yet
to be published. As mentioned by Bender, Cole, Demaine, Farach-Colton, and Zito in [BCD+02],
the results of Dietz and Sleator in [DS87] rely on a complicated, counterintuitive potential function
in order to achieve their amortized cost solution (this is cited by Dietz and Sleator [DS87] as a
disadvantage of their analysis). Dietz and Raman in [DR93] obtained another solution for the
MLLP, where each insertion relabels O(log n) tags in the worst case, in O(log² n) worst-case time.
They further claim that it is possible to reduce the time to O(log n) worst-case, but many
non-trivial details are missing. A lower bound showing that at least Ω(log n) relabelings need to take
place per insertion was shown by Dietz, Seiferas, and Zhang in [DSZ94].

A third closely related problem, called the File-Maintenance Problem (FMP), is the same as
the MLLP, except now the range of the tags is bounded by O(n). Willard in [Wil86] was the first to
provide a very complicated solution for the FMP, where each insertion costs O(log² n) worst-case
time. Bender, Cole, Demaine, Farach-Colton, and Zito in [BCD+02] gave a simpler solution for
the FMP, but some proofs are missing. Recently, Bulanek, Koucky, and Saks in [BKS12] showed
a matching lower bound. In the second part of [DS87], Dietz and Sleator showed how Willard's
complicated solution for the FMP can be used to provide a solution for the OMP, where each
insertion costs O(1) worst-case time.

Alas, to date there is no published version of a worst-case solution for the OMP which is not
considered highly complex. This is especially surprising considering that the OMP is a common
building block for many data structures in dynamic settings. Furthermore, for the purposes of the
results presented here, the solution of Dietz and Sleator [DS87] does not suffice, as is explained
later.



Predecessor Queries: Predecessor data structures are ubiquitous in the computer science
literature. The static predecessor problem is to store a set of integers and perform predecessor queries
on the set. The dynamic predecessor problem also allows insertions and deletions on the set. In the
comparison based model an Ω(log n) lower bound for the predecessor search within a set of size n
is easy to obtain. However, improved bounds are possible in the RAM model¹.

Several data structures have been proposed for the predecessor problem. For example, the van
Emde Boas data structure [vEB77] was presented as an efficient data structure for small universes.
Namely, operations are performed in O(log log u) time for a universe of size u. The space for the
van Emde Boas data structure is O(u), and can be reduced to O(n) using randomization. Other
solutions, such as x-fast tries and y-fast tries [Wil83], have been suggested as well. There are many
other results and the interested reader is directed to [BF02, PT06] for other upper and lower bounds
on the problem.



Combining the Two: In the POLP, the goal is to maintain a dynamic ordered list L, but, in
addition to order queries, there are dynamic subsets S_1, ..., S_k ⊆ L which need to be supported
to answer predecessor queries. In a predecessor query on a set S_i, the input is a node u ∈ L and
the output is the largest element v ∈ S_i which is smaller than u, where the order is defined by L.
For simplicity's sake, assume that the subsets are disjoint. An exposition of the case of non-disjoint
sets is left for the full paper.

1 All of the results in this paper are in the RAM model.
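To make the interface concrete, here is a deliberately naive reference sketch of the POLP operations in Python. It only fixes the semantics of the queries: every operation is linear time, so it achieves none of the bounds claimed in this paper. The class name `NaivePOLP` and its method names are hypothetical and do not come from the paper.

```python
class NaivePOLP:
    """Reference semantics only: every operation here takes linear time."""

    def __init__(self, first):
        self.order = [first]   # the list L, front to back
        self.member = {}       # node -> index i of the (disjoint) subset S_i holding it

    def insert_after(self, u, v):
        """Insert the new node u immediately after the existing node v in L."""
        self.order.insert(self.order.index(v) + 1, u)

    def precedes(self, u, v):
        """Order query: does u come before v in L?"""
        return self.order.index(u) < self.order.index(v)

    def add_to_subset(self, u, i):
        """Put node u, which is already in L, into the subset S_i."""
        self.member[u] = i

    def predecessor(self, u, i):
        """Largest element of S_i that is smaller than u in the order of L."""
        pos = self.order.index(u)
        for v in reversed(self.order[:pos]):
            if self.member.get(v) == i:
                return v
        return None


p = NaivePOLP("a")
p.insert_after("c", "a")
p.insert_after("b", "a")       # L is now a, b, c
p.add_to_subset("a", 1)
p.add_to_subset("b", 2)
pred_in_s1 = p.predecessor("c", 1)   # the node "a"
pred_in_s2 = p.predecessor("c", 2)   # the node "b"
```

Note that the query key u is a node of L and need not belong to S_i itself, which is exactly what distinguishes the POLP from an ordinary predecessor structure over integers.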

The goal is to support the operations on the subsets in time that depends only on the size
of L, and not on the size of the universe from which the elements in L are drawn.
In this sense, the POLP setting can be viewed as some sort of embedding. However, this will only
work efficiently if it is possible to quickly locate the position in L of every new element, which is
not part of the scope of this paper. Given how useful the order-maintenance data structure has
been in the data structure world, it is entirely conceivable that other than the on-line suffix tree
construction many more applications exist for the POLP.



The POLP was first implicitly solved by Dietz in [Die89] where a solution for fully persistent
arrays was introduced. The structure there answers order queries in O(1) worst-case time, performs
insertions into L and the subsets in O(log log n) expected amortized time, and answers predecessor
queries in O(log log n) time. However, the solution there does not deal with deletions.

In this paper, the focus is on developing a worst-case solution. Thus, the data structure
presented here performs insertions and deletions to the list in O(1) worst-case expected time and
to the subsets in O(log log n) worst-case expected time. Answering order queries is done in O(1)
worst-case time and answering predecessor queries is done in O(log log n) worst-case time.

Due to space limitations, the discussion of deletions is deferred to the full version². Nevertheless,
for the purposes of the applications mentioned here, deletions are not needed. Notice that these
time bounds match the currently best known bounds for the dynamic predecessor problem. It is
currently a very interesting open problem whether or not the expectation can be removed from the
update time in the dynamic predecessor problem, without increasing the query time ( [jPllj] ). 

1.2 The Difficulties 



Dietz and Sleator in [DS87] in their complicated solution for the OMP provide each element in L 
with at most two tags, each with a timestamp, such that given two nodes in L, their order can be 
determined from the tags alone. The tags are integers from a range polynomial in the size of L. 
This gives some intuition as to why one might expect that solving POLP can be done within the 
claimed time bounds. However, there are two main difficulties that need to be dealt with in order 
to solve the POLP efficiently. 

The first is that given how the predecessor data structures use the bit representation of integers,
it is not clear how double tags with timestamps could be made to work. The second difficulty is
that an insertion of a new node into L can cause Ω(polylog(n)) tags of elements in L to change,
which could be costly for a predecessor data structure used directly on the tags.

The first problem is solved by presenting a new solution for the MLLP (see Section 3)
where each insertion costs O(log n) worst-case time³. This solution for the MLLP is then used to
provide a new data structure for the OMP with worst-case bounds, where each element has only
one integer tag at a given time. This is explained in more detail in Section 4. The second problem
is solved by tying together the indirections used for the order-maintenance data structure and the
predecessor data structures, together with careful scheduling of processes. This is explained in
more detail in Section 5.

2 Notice that for the OMP deletions can be easily dealt with by marking nodes as deleted, and using a standard
rebuilding technique once the number of deleted nodes in L becomes large enough. Also notice that one cannot delete
a node u ∈ L if it is in some set.

3 It is highly conceivable that Bender, Cole, Demaine, Farach-Colton, and Zito in [BCD+02] had a similar solution
in mind when they claimed, without proof, that their amortized solution can be deamortized.



1.3 Fully Persistent Arrays 

Due to space limitations, some more applications of the POLP are deferred to the full paper.
Nevertheless, it is briefly pointed out here that by replacing Dietz's solution in [Die89] with the
solution presented here as a black box, the amortized time bounds of fully-persistent arrays become
worst-case (though insertions are still in expectation), which immediately implies improved bounds
for the general method of making any data structure fully persistent in the RAM model, as described
by Dietz in [Die89].



2 On-line Suffix Tree Construction 

In the on-line suffix tree construction, the goal is to support extensions of the text T in which new
characters are added to its beginning, i.e., constructing the suffix tree of aT from the suffix tree of
T, where a ∈ Σ. When referring to the suffix tree the intention is the suffix tree of T before the
additional character is added, unless mentioned otherwise.

The discussion here assumes the reader is familiar with the basics of the suffix tree data
structure. Recall that each node in the suffix tree has a maximum out-degree of |Σ| (every outgoing
edge represents a character from Σ, and any two outgoing edges represent different characters).
For the purposes here, a hash function is used to map each character to its appropriate edge. In
addition, for any node u in the suffix tree, the length of the string corresponding to the path from
the root to u is denoted by length(u), and is maintained within u. In addition, all of the suffixes
are maintained within a lexicographically ordered list of suffixes.

The process of inserting the new suffix aT into the suffix tree is broken into three phases. The
first phase locates the position of the new suffix in the list of sorted suffixes. The second phase 
locates the place in the suffix tree to which the new suffix needs to be added. Finally, in the third 
phase, insertion of the new suffix is implemented by either adding a new leaf as a child of a node 
already in the suffix tree, or by splitting an edge in the suffix tree into two by adding a new node 
u into the edge, and then the new leaf is a child of u. In either case the machinery used needs to 
be updated as well. 



2.1 Phase 1: Searching in the List 

At first glance, maintaining the list of ordered suffixes in a POLP data structure seems to suffice, 
as all that needs to be done is perform a predecessor query on the list of ordered suffixes with the 
new suffix as the key. However, this will not work as the new suffix is not yet in the suffix list, 
while the POLP assumes that the key is already part of the list. 

To solve this, notice that when comparing two different suffixes (or strings for that matter),
it is possible to break down the comparison process into two. The first comparison is done by
comparing the first character of each suffix. If these two characters are different, then the order of the two
suffixes is determined by just those characters. Otherwise, the rest of those two suffixes will set
the order. So when attempting to locate the predecessor of aT: (1) locate the contiguous sublist in
the ordered list of suffixes of T which all correspond to suffixes starting with a, and (2) within this
sublist, find the predecessor of aT.

To solve (1) efficiently one can use any predecessor data structure on the different characters of
Σ which appear in T. Using a y-fast trie [Wil83], for example, allows locating the sublist
in O(log log |Σ|) time. Notice that a hash function will not suffice here as there is a need to know
the order between the different characters present in T, and it is possible that this is the first time
a appears.

To solve (2), notice that the order of the suffixes in the sublist corresponding to suffixes of T
beginning with a is determined by truncating a from each of those suffixes, and determining the
order of the remaining substrings. Luckily, each of those substrings is also a suffix of T. So for
each a ∈ Σ, a predecessor structure P_a is maintained over the nodes from the suffix list which
correspond to suffixes that begin with a, where the keys are the truncated suffixes. In other words,
the key for each suffix aT' in this set is the node of suffix T' in the order-maintenance structure.
Notice that when truncating aT, the remaining suffix T is also in the ordered suffix list, and so
performing a predecessor query on P_a where the key being searched is the node in the suffix list
corresponding to T will find the location of the predecessor of the suffix aT in the ordered list. It
will be shown in Section 5 that such a query will cost O(log log n) worst-case time.
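The two steps of Phase 1 can be illustrated with a small Python sketch. It is only meant to show that the position of aT is found by comparing positions of existing suffixes, never by comparing strings. A plain list with linear-time lookups stands in for the POLP structure, a sorted Python list stands in for the predecessor structure on characters, and the empty suffix is kept as a sentinel; these choices, and the names `OnlineSuffixList`, `prepend` and `_rank`, are assumptions made for the illustration.

```python
import bisect


class OnlineSuffixList:
    """Sorted list of the suffixes of a text that grows at its front."""

    def __init__(self):
        self.text = ""
        self.L = [""]      # sorted suffixes; the empty suffix acts as a sentinel
        self.P = {}        # P[a]: suffixes T' such that a + T' is already in L
        self.chars = []    # sorted distinct characters seen so far

    def _rank(self, s):
        # Naive stand-in for an order-maintenance tag: the position of s in L.
        return self.L.index(s)

    def prepend(self, a):
        T = self.text
        r = self._rank(T)
        # Step (2): predecessor of T among the keys of P[a], by position in L only.
        smaller = [t for t in self.P.get(a, ()) if self._rank(t) < r]
        if smaller:
            pred = a + max(smaller, key=self._rank)
        else:
            # Step (1): aT is the smallest suffix starting with a, so its
            # predecessor is the last suffix starting with the largest
            # character below a (or the sentinel if there is none).
            j = bisect.bisect_left(self.chars, a)
            if j == 0:
                pred = ""
            else:
                b = self.chars[j - 1]
                pred = b + max(self.P[b], key=self._rank)
        self.L.insert(self._rank(pred) + 1, a + T)
        self.P.setdefault(a, set()).add(T)
        if a not in self.chars:
            bisect.insort(self.chars, a)
        self.text = a + T


idx = OnlineSuffixList()
for ch in reversed("banana"):   # the text arrives from its end towards its beginning
    idx.prepend(ch)
```

After the loop, `idx.L` holds the empty sentinel followed by the suffixes of "banana" in lexicographic order. In the actual construction the two `max(...)` scans and the `_rank` lookups are what the POLP predecessor and order queries replace.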

2.2 Phase 2: Searching in the Tree 

Because the techniques used in this phase are either standard or use other data structures as a
black box, only a sketch of the process is presented.

Once the location in the list of ordered suffixes is found, it is time to locate the place in the
suffix tree into which the leaf of the new suffix needs to be added. The insertion of the new suffix
is implemented by either adding a new leaf as a child of a node already in the suffix tree, or by
splitting an edge in the suffix tree into two by adding a new node u into the edge, and then the new
leaf is a child of u. In either case, notice that this entry point is on the path from the root of the
suffix tree to one of the neighbors of aT in the list of ordered suffixes. To determine which of the
neighbors is the one of interest one can perform a Longest Common Prefix (LCP) query in constant
time, using the data structure of Franceschini and Grossi in [FG04]⁴. Then, one can locate the
entry point in O(log log n) time using weighted level ancestor queries on dynamic trees [KL07],
where the weight of each node is its length, and the query is the LCP of aT and its appropriate
neighbor. The entire process takes O(log log n) time.

2.3 Phase 3: Updating the Suffix Tree and Machinery 

If the new suffix is inserted as a child of a pre-existing node, then the new edge leading to the
new leaf is inserted into the appropriate hash function used for navigation down the suffix tree.
The new leaf needs to be inserted into the machinery used (i.e., weighted level ancestor queries).
This can be done in O(log log n) worst-case expected time [KL07]. Also, the new suffix needs to
be inserted into the LCP data structure, which takes constant worst-case time [FG04], and if a
is a new character in the text it needs to be inserted into the vEB structure for the alphabet in
O(log log |Σ|) worst-case expected time. Finally, the new suffix is added into the suffix list and
the node in the suffix list corresponding to T is added to P_a, both of which are done by updating
the POLP data structure. It will be shown in Section 5 that such an update will cost O(log log n)
worst-case expected time.

4 Similar to phase 1, the LCP of two strings can be determined by either the first character, or the LCPs of the
suffixes without the first character.

If the new suffix is inserted together with a new inner node then the inner node needs to update
the hash of its parent, create a hash function for itself containing the end of the edge it broke (i.e.,
the previous child of the new inner node's parent), and be inserted into the machinery used on the
trees. This can be done in O(log log n) worst-case expected time [KL07]. The insertion of the new
leaf is performed as before.

Notice that many of the operations on suffix trees (assuming linear space is desired) use various
pointers to the text in order to save space for labeling the edges. It is shown by Amir, Kopelowitz,
Lewenstein, and Lewenstein in [AKLL05] how to maintain such pointers, called text links, within
the time and space constraints. Also notice that a copy of the text saved in array format may be
necessary for various operations, requiring direct addressing. This can be done with constant time
update by standard de-amortization techniques. Thus, the following is obtained.

Theorem 2.1. There exists an on-line suffix tree construction where the cost for each addition of
a character is O(log log n + log log |Σ|) worst-case expected time.



3 Monotonic List Labeling 



Following the methods of both Dietz and Sleator in [DS87] and Bender, Cole, Demaine,
Farach-Colton, and Zito in [BCD+02], each element in L is provided with a tag, such that given two
nodes in L, their order can be determined from their tags alone. For the purpose of this paper, a
worst-case implementation is needed that can support the needs of the predecessor data structures
which will be used in Section 5.



3.1 The Averaging Method. 

One possible tag-scheme would be to assign a number to a newly inserted node which is the average
of the tags of its two neighboring nodes. The problem with this solution is that each tag would
require n bits, and so determining the order of two nodes would take O(n / log n) time, and not O(1)
time which is the goal. Thus, a different solution is needed. Nevertheless, if the number of nodes
in the list is O(log n), then this solution can indeed be used. This solution is named the averaging
method, and will be used on some small lists in the solution presented for larger lists.
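A minimal Python sketch of the averaging method is given below. It assumes that tags are b-bit integers and that two sentinel values, 0 and 2^b, bound the range without ever being assigned to an element; the class name `AveragingList` and its fields are hypothetical. Under these assumptions any sequence of at most b insertions succeeds, since each insertion at worst halves the smallest gap, which is why the method suits lists of O(log n) nodes with O(log n)-bit tags.

```python
class AveragingList:
    """Small ordered list whose tags are averages of the neighbouring tags."""

    def __init__(self, bits):
        self.bits = bits
        self.lo, self.hi = 0, 1 << bits   # sentinel tags, never given to an element
        self.items = []                   # (tag, value) pairs in list order

    def insert_at(self, pos, value):
        """Insert value at index pos and return the tag it was assigned."""
        left = self.items[pos - 1][0] if pos > 0 else self.lo
        right = self.items[pos][0] if pos < len(self.items) else self.hi
        if right - left < 2:
            raise OverflowError("no free tag between the two neighbours")
        tag = (left + right) // 2
        self.items.insert(pos, (tag, value))
        return tag

    def precedes(self, i, j):
        """Order query between the elements at indices i and j, by tags alone."""
        return self.items[i][0] < self.items[j][0]


small = AveragingList(bits=8)
# Worst case: always insert at the front, so the gap next to 0 halves each time.
front_tags = [small.insert_at(0, f"x{t}") for t in range(8)]
```

With `bits=8` the eight front insertions receive the tags 128, 64, 32, 16, 8, 4, 2, 1, and a ninth insertion at the front would raise `OverflowError`.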



3.2 The Weight Balanced B-Tree 



Arge and Vitter in [AV03] introduced the Weight Balanced B-Tree (WBBT). In the WBBT, data
is maintained in the leaves. The weight of each leaf is defined as the number of elements in that
leaf, and the weight of an internal node is the sum of the weights of its children, i.e., the sum of
the weights of the leaves in its subtree. The WBBT is defined as follows, for branching parameter
a > 4 and leaf parameter k > 0:

• All of the leaves are at the same depth, and have weight between k and 2k − 1.

• An internal node of height h has weight at most 2a^h k, and every internal node, except for
the root, has weight at least (1/2)a^h k.

Arge and Vitter proved the following (proof is omitted here): 
Lemma 3.1 (from [AV03]). Every internal node in the WBBT has between a/4 and 4a children,
except for the root which has between 2 and 4a children.

Corollary 3.1 (from [AV03]). The height of the WBBT with n elements is O(log_a (n/k)).



For the purpose of the application here, k and a are some constants, and thus the height of
the WBBT is O(log n). The elements in L are maintained in the leaves of the WBBT. When an
insertion is made, the O(log n) ancestors of the appropriate leaf are informed that their weight has
increased. This might cause the size of some nodes to become too large, as their weight is above
the allowed bounds by the definition of the WBBT. Such nodes are called overflowed nodes. Every
overflowed node u is split into two new nodes ul and ur, and its children are divided as evenly as
possible between the two new nodes. Each split requires constant time, for a total of O(log n) time
to insert a new node. Arge and Vitter proved the following.



Lemma 3.2 (taken from [AV03]). If the weight of u prior to it splitting is denoted by W, then
after the split, the weights of ul and ur are both Ω(W). Thus, at least Ω(W) insertions need to be
made into the subtree of ul (ur) before it must split again.

Lemma 3.2 is a crucial and useful property of the WBBT, as it provides a method of informing
u's subtree that u has split before another split happens to either ul or ur. This is done by
performing a scan of the subtrees of ul and ur, which is spread over the insertions into any of
those subtrees.

It is also important to notice that if the root ever has to split (due to its weight becoming too
large), then a new root is created as the parent of the old root. For simplicity's sake, assume without
loss of generality that the root is never split. This assumption can be made because a rebuilding
scheme can be used in case the weight of the root ever reaches its upper bound. Since this is
fairly standard, details are omitted.



3.3 An O(logn) Implementation 

The Tag Scheme: Denote by H the height of the WBBT. Each element in L is assigned a bit
string of length (4a + 1)H + 2k = O(log n) as its tag, which is treated as an integer. Each level in
the WBBT is responsible for 4a + 1 bits, except for the leaves which are responsible for 2k bits. The
way the responsibility works is that each child v of a node u at level i has a different bit string of
length 4a + 1, denoted by label(v), which is called the label of v. In v's subtree, all of the elements
(in the leaves) have label(v) as a substring of their tag, at locations [(4a + 1)i + 1, (4a + 1)(i + 1)]. If
the labels of the children of u are assigned in such a way that the labels maintain the order of the
children of u, then by comparing the tags of two elements, each in a subtree of a different child of
u, the order of these elements will be determined by the order of the labels of those children of u.
The reason the scheme works is because the path from the root to u is the same for both elements,
and so the most significant bit which differs must be related to the labels assigned to the children of
u. The last 2k bits are assigned by the leaf to the elements within it, using the averaging method.

It must be guaranteed that the order of the children of u is correctly represented within the
label of each of the children. This is done by using the averaging method on the 4a least significant
bits of each label. The use of the extra most significant bit is revealed later. Since each node
does not have more than 4a children, 4a bits suffice. It is important to notice that the splitting
of u into ul and ur is done by setting ul = u, and inserting ur after ul in the list of children of
the parent of u. In addition, the labels of the children of u all need to be reassigned in order to
spread them out within the range defined by 4a bits. This reassignment can be easily afforded as
the number of children is bounded by a constant. Thus, splitting a node and updating the labels
of its children takes constant time.
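The way a tag is assembled from per-level labels can be sketched as follows. The sketch assumes that labels of nodes closer to the root occupy the more significant bits, so that comparing two tags as integers compares the labels level by level from the root down; the function name `make_tag` and the example values are hypothetical.

```python
def make_tag(labels, leaf_value, a, k):
    """Concatenate per-level labels and the leaf's own bits into one integer tag.

    labels     -- labels on the root-to-leaf path, topmost level first,
                  each a (4a + 1)-bit integer
    leaf_value -- the 2k-bit value the leaf assigns by the averaging method
    """
    width = 4 * a + 1
    tag = 0
    for lab in labels:
        assert 0 <= lab < (1 << width)
        tag = (tag << width) | lab       # deeper levels go into less significant bits
    assert 0 <= leaf_value < (1 << (2 * k))
    return (tag << (2 * k)) | leaf_value


a, k = 8, 2                               # illustrative constants only
x = make_tag([3, 7], 1, a, k)             # path: label 3, then label 7
y = make_tag([3, 9], 0, a, k)             # same first label, larger second label
z = make_tag([4, 0], 0, a, k)             # larger label already at the first level
```

Here x < y because the two paths share the first label and differ at the second, and y < z because the first label already differs, regardless of the less significant bits. With a path of H = 2 labels, each tag fits in (4a + 1)H + 2k = 70 bits.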



3.3.1 Updating Tags from New Labels 

Once a split occurs, there is still a need to update all of the elements in the subtrees of ul and ur
with the new labels. The process of this update is called a tag-process and is denoted by P_u for a
process initiated by u splitting. When an insertion is made into a leaf of the WBBT, the O(log n)
ancestors of the appropriate leaf are informed that their weight has increased. Each time a node
has its weight increased it is given 1 unit of time resource which needs to be spent immediately (so
the time resources do not accumulate). The time resource given to either ul or ur is then given
to P_u and is used to pay for O(1) operations performed by P_u. Of course, if P_u has completed
then the time resource is discarded. Notice that there will be situations in which P_u will give its
time resource to a different tag-process to use, as will be explained later. In any case, this time
resource scheme will guarantee that the total amount of work performed by tag-processes due to
an insertion into L is bounded by O(log n).



Denote by W the weight of u prior to the split. Due to Lemma 3.2, at least Ω(W) insertions
of elements must be made into either ul's subtree or ur's subtree before they split again. Thus,
if a large enough constant number of leaves in those subtrees is updated whenever P_u receives a
time resource, the tags of the appropriate elements will all be updated with the new labels before
the next splitting of either ul or ur occurs⁵. It is important to notice that due to the method
used in which each level in the tree is responsible for a different part of a leaf's tag, concurrent
tag-processes updating labels from different levels in the WBBT do not interfere with each other.

The tag-process P_u has three sequential phases. During the first phase, the subtree of ur is
updated with the new label of ur replacing the previous label of u. This is done by first updating
the rightmost leaf in the subtree of ur and ending with the leftmost leaf. The order of updates
is important so as to guarantee that order queries asked during the update process are answered
correctly, even if the order query is performed on nodes of L which are in the subtree of ur. The
second and third phase, which are interchangeable, are responsible for updating the subtrees of ul
and ur with the new labels of the children of ul and ur. For consistency's sake, the second phase is
assigned to ul and the third is assigned to ur. The process is shown for ul as the process for ur
is exactly the same.



Footnote 5: There is also the issue of scanning the subtree, which is fairly standard and is done
within the overall O(W) work.

Updating Labels for ul's Children: There are several issues that need to be dealt with while
updating the elements in ul's subtree, as order queries could be made during the process of updating
the tags. There must be some guarantee that the tags are consistent with the true order, even if
an order query is made while a tag-process isn't complete. This is the reason for the extra most
significant bit within the labels. Before P_u is initiated, there is a guarantee that this bit is the
same for all of the labels of the children of u. Without loss of generality assume this bit is set to
be 0. When u is split, P_u reassigns labels to ul's children, but now their most significant bit is
changed to 1. When the subtree of ul is updated with the new labels, the update proceeds from the
rightmost leaf in the subtree towards the leftmost leaf. This guarantees that if an order query is
made between two elements in the subtree of ul then:

• If both elements have already been updated with the new label of ul's child, then the 4a + 1
bits for which the children of ul are responsible will correctly determine the order.

• If neither element has been updated with the new label of ul's child, then the 4a + 1 bits
for which the children of ul are responsible are the same as prior to u splitting, and so they
correctly determine the order.

• If one element, α, has been updated, while the other element, β, has not, then it must be that α
is larger than β (due to the order in which the leaves are updated), and so the most significant
bit in the 4a + 1 bits for which the children of ul are responsible is 1 for α and 0 for β. Thus
the tags correctly maintain the order.
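
The three cases above can be checked on a toy instance. The following is a minimal sketch under simplifying assumptions that are not part of the paper: every child of ul has exactly one leaf, a tag is modelled as a pair (most significant bit, label), and the function names are hypothetical. It records the tags after every single update step so that the order can be inspected at each intermediate moment.

```python
def relabel_snapshots(old_labels, new_labels, right_to_left=True):
    """Return the list of tags after each update step.

    A tag is the pair (msb, label); comparing pairs lexicographically is the
    same as comparing the bit strings msb followed by label.
    """
    tags = [(0, lab) for lab in old_labels]      # before the split: MSB = 0
    snapshots = [list(tags)]
    positions = range(len(tags))
    for i in (reversed(positions) if right_to_left else positions):
        tags[i] = (1, new_labels[i])             # new label carries MSB = 1
        snapshots.append(list(tags))
    return snapshots


def order_preserved(tags):
    """True if the tags are strictly increasing from left to right."""
    return all(a < b for a, b in zip(tags, tags[1:]))
```

Updating from the rightmost leaf keeps every snapshot sorted, since an updated element (MSB 1) is always to the right of a not yet updated one (MSB 0). Updating from the leftmost leaf breaks the order as soon as the first element is relabelled.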

3.3.2 Collisions of Splitting Processes. 

Let w be the parent of u prior to u splitting, and let v be a child of u prior to u splitting. A
difficulty arises when either w or v split during the execution of P_u. If w splits then care needs
to be taken with regard to updating the tags in the subtree of ur, as the first phase of P_u will be
using the label assigned to ur by P_u, while the third phase of P_w will be using the label assigned
to ur by P_w. If v splits into vl and vr then care needs to be taken with regard to updating the
tags in the subtree of vr, as the second or third phase of P_u will be using the label assigned to
v by P_u, while the first phase of P_v will be using the label assigned to vr by P_v. However, it is
important to notice that when P_u is in a collision of this sort with P_w, it cannot be in a collision
with P_v, as collisions with P_w can only happen during the first phase of P_u, while collisions with
P_v only happen during the second or third phase of P_u. Also notice that w (v) splitting before u
is analogous to u splitting before v (w).

To solve these collisions, a careful scheduling of the processes is needed, as is described next.

When v splits during P_u: In this case, P_u assigns a new label to v. Then v splits, and P_v
assigns a new label for vr. At the end of the execution of both P_u and P_v, the label assigned to vr
by P_v must be the label which is assigned in the appropriate locations in all of the tags of leaves
in vr's subtree. This is guaranteed as follows. If P_u has already updated the subtree of v before
P_v begins updating vr's subtree, then no special modifications need to be made. If P_v has already
begun updating vr's subtree when P_u reaches v's subtree (and in particular vr's subtree), then P_u
uses its time resources to help P_v finish updating vr's subtree. In such a case, the label assigned to
v by P_u is never used to update any tag in vr's subtree. It is important to notice the significance
of the MSB in order to understand the correctness of this process scheduling. Finally, if P_u is in
the process of updating v's subtree when P_v begins, then first P_v uses its time resources to help P_u
finish updating v's subtree with the label assigned to v by P_u. Say the amount of resources P_v uses
to help P_u is x. When P_u is done updating v's subtree, the next x time resources given to P_u are
passed to P_v. Notice that P_v is guaranteed to receive those time resources from P_u, since every
time a time resource is given to P_v, a time resource is also given to P_u, as u was an ancestor
of v. This way, the subtree of v is guaranteed to be completely updated with the label assigned to
v by P_u before the new label assigned to vr by P_v is even considered, and the processes are still
guaranteed to complete on time.

When w splits during P_u: In this case P_u assigns a label for ur. Then w splits, and P_w assigns
a new label to ur. At the end of the execution of both P_w and P_u, the label assigned to ur by
P_w must be the label which is assigned in the appropriate locations in all of the tags of leaves in
ur's subtree. This is guaranteed as follows. If P_u has already updated the subtree of ur before P_w
reaches ur's subtree, then no special modifications need to be made. Also, in this case, it is not
possible for P_w to be in the process of updating ur's subtree when P_u reaches ur, as P_u begins with
ur on its first step. The only problematic situation in this case is if P_u has already begun updating
ur's subtree when P_w reaches ur's subtree. In such a situation, P_w uses its time resources to help
P_u finish updating ur's subtree. Say the amount of resources P_w uses to help P_u is y. Notice
that, as opposed to the case considered above, P_w cannot be guaranteed that any more insertions
will be made into the subtrees of either ur or ul, and therefore the y time resources that P_w used
to assist P_u cannot be guaranteed to be paid back by P_u, as u isn't an ancestor of w. To
solve this, P_w performs double the work it would normally do (which is still O(1)) when assisting
P_u for each of the y time resources, and then, for the next y time resources which are given to P_w
after it is done assisting P_u, it also does double the work (this extra work will be going directly
into updating the subtree of ur with the new label assigned to it by P_w).



3.4 The Bottom Line 

To recap, each time a new element is added to L, there are O(log n) weight increases, where each
weight increase might induce a split (which takes constant time), and also might perform a constant
number of operations to update some labels and tags. Thus, each insertion requires O(log n)
worst-case time. It is important to notice that due to the nature of the labeling and tagging scheme, it
is possible to answer order queries correctly even while an update to the WBBT is taking place, as
each tag-process does not create inconsistencies with other tag-processes. Moreover, the order of
any two nodes can be decided by the order of the binary representation of their tags, as opposed to
using several tags per node, together with some timestamps. Thus, the following has been proven.

Theorem 3.3. It is possible to solve the MLLP with O(log n) worst-case relabels and time per
insertion. Furthermore, order queries can still be answered correctly while tag-processing is taking
place.



4 Order-Maintenance Data Structure 

Indirection is used in order to achieve an O(1) worst-case time bound per insertion, which closely
follows the techniques of Dietz and Sleator in [DS87]. The list L is partitioned into consecutive
sublists. Each sublist is called a chunk. The main idea follows from the following lemma.

Lemma 4.1 (from [LO88, DS87]). If, after every k insertions into any chunk, the largest chunk is split
into two roughly equally sized chunks, then the size of the largest chunk is O(k log n), and the total
number of chunks is O(n/k).

So if every log n insertions into any chunk a split takes place, the total number of sublists is
O(n/log n) and the size of each chunk is O(log^2 n). In the following it is shown how to implement
the chunks with the appropriate operations, and how the chunks are used in collaboration with the
solution to the MLLP to efficiently solve the OMP.
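
The splitting policy of Lemma 4.1 can be explored with a small simulation. The following is only an illustrative sketch and not part of the construction: chunks are represented by their sizes alone, the function name and parameters are hypothetical, and the two insertion strategies (uniformly random chunk, or always the currently largest chunk) are arbitrary choices made for the experiment.

```python
import random


def simulate_chunks(n, k, adversarial=False, seed=0):
    """Insert n - 1 elements into a list that starts with one element.

    After every k insertions the largest chunk is split into two halves.
    Returns (largest chunk size ever seen, number of chunks, total size).
    """
    rng = random.Random(seed)
    chunks = [1]                       # sizes of the chunks
    largest_seen = 1
    for step in range(1, n):
        if adversarial:
            i = max(range(len(chunks)), key=chunks.__getitem__)
        else:
            i = rng.randrange(len(chunks))
        chunks[i] += 1
        largest_seen = max(largest_seen, chunks[i])
        if step % k == 0:              # every k insertions: split the largest
            j = max(range(len(chunks)), key=chunks.__getitem__)
            size = chunks[j]
            if size >= 2:
                chunks[j] = size // 2
                chunks.append(size - size // 2)
    return largest_seen, len(chunks), sum(chunks)
```

The simulation locates the largest chunk by a linear scan for brevity; the constant-time method is the subject of Section 4.1.4.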

4.1 The Chunks



The implementation of each chunk of size O(log^2 n) is as follows. Each chunk is maintained with a
tree of depth 2. The nodes in the sublist maintained by the chunk are the leaves of this tree, which
are all at depth 2. Each non-leaf other than possibly the root has between (log n)/2 and log n children.
The root has at most O(log n) children. For every non-leaf node, the order of its O(log n) children
is maintained using the averaging method as described above.

The following operations are needed on each chunk.



4.1.1 Order Query 

The order of any two leaves which are siblings can be determined from the tags given to those
leaves by their parent, and the order of any two leaves with different parents can be determined by
the tags of the parents given to them by the root. In any case this takes O(1) time.



4.1.2 Insertion 

When a new node u is inserted into a chunk, it will always be added after a node v which was already
in the chunk. Let p be the parent of v in the depth 2 tree, and let r denote the root of this tree.
At first, u is inserted after v in the order structure of p's children (using the averaging method).
If the insertion of u increases the number of children of p to be more than log n, then p is split
into two by creating a new sibling called p'. This new sibling is inserted as a child of r, following
p in the order of the children of r. Notice that the number of children of r is always bounded by
O(log n). Starting from this point in time until (log n)/4 insertions are made into the children of
either p or p', each such insertion transfers the last two nodes from the children of p to the
children of p'. Then, during the next (log n)/4 insertions into the children of either p or p', p
reassigns the labels of its children as follows. Using the same technique as in Section 3.3, each
label has an extra bit at the most significant location, and p guarantees that the order is
maintained during the reassignment using this bit (details are similar to those of Section 3.3 and
are thus omitted). In total, the entire process takes O(1) time per insertion.



4.1.3 Tracking the median 

It will become apparent later that there is a need to track the median of each tree of depth 2 as 
insertions are made into that tree. This is done as follows. Let m be the median of the chunk 
prior to the insertion, and in addition, maintain the number of nodes preceding m and the number 
of nodes following m. When an insertion happens, the first step is to discover if this new node 
precedes or follows m. This is done with an order query. Then the appropriate counter is updated. 
If the counters differ by more than 1, then the median needs to move one step in the appropriate 
direction in the sublist in order to balance them out, updating the counters accordingly. This entire 
process takes O(1) time per insertion.
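
The counter-based median tracking can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's implementation: the sublist is a plain doubly linked list, the constant-time order query is replaced by comparing exact rational labels (a stand-in chosen only to keep the example self-contained), and the names Node, MedianTracker, and insert_after are hypothetical.

```python
from fractions import Fraction


class Node:
    def __init__(self, label):
        self.label = label     # stand-in for the O(1) order query
        self.prev = None
        self.next = None


class MedianTracker:
    def __init__(self, first):
        self.median = first
        self.before = 0        # number of nodes preceding the median
        self.after = 0         # number of nodes following the median

    def insert_after(self, v):
        """Insert a new node right after v and rebalance the median."""
        nxt = v.next
        if nxt is not None:
            label = (v.label + nxt.label) / 2
        else:
            label = v.label + 1
        u = Node(label)
        u.prev, u.next = v, nxt
        v.next = u
        if nxt is not None:
            nxt.prev = u
        # "Order query": does the new node precede the median?
        if u.label < self.median.label:
            self.before += 1
        else:
            self.after += 1
        # Move the median one step if the counters differ by more than 1.
        if self.after - self.before > 1:
            self.median = self.median.next
            self.before += 1
            self.after -= 1
        elif self.before - self.after > 1:
            self.median = self.median.prev
            self.before -= 1
            self.after += 1
        return u
```

Each insertion performs one comparison, one counter update, and at most one pointer step, which matches the constant-time claim above.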



4.1.4 Locating the largest chunk 

Locating the largest chunk is fairly standard and can be done using an auxiliary dynamic array of
size O(log^2 n), where each entry in the array is a doubly linked list of all chunks of size equal to
the index of that location. In addition, all of the non-empty locations in this array are maintained
in a doubly linked list. The key observation is that the size of a chunk can only change by 1 due to
an insertion, and therefore changes to this doubly linked list are very local. Details are standard
and are thus omitted.
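
Since the details are omitted, the following sketch shows one way the size-indexed structure might look for the insertion case only. It is an assumption-laden illustration: Python dicts and sets stand in for the dynamic array and the per-size doubly linked lists, all chunks start with size 1, removing and re-inserting chunks after a split is not handled, and the class and method names are hypothetical.

```python
class LargestChunkIndex:
    def __init__(self, chunk_ids):
        ids = list(chunk_ids)
        self.size = {c: 1 for c in ids}
        self.bucket = {1: set(ids)}   # size -> chunks of exactly that size
        self.prev = {1: None}         # linked list over the non-empty sizes
        self.next = {1: None}
        self.max_size = 1

    def increment(self, c):
        """Chunk c grew by one element; only neighbouring sizes are touched."""
        s = self.size[c]
        t = s + 1
        if t not in self.bucket:
            # Size t becomes non-empty: link it directly after s.
            nxt = self.next[s]
            self.bucket[t] = set()
            self.prev[t], self.next[t] = s, nxt
            self.next[s] = t
            if nxt is not None:
                self.prev[nxt] = t
            else:
                self.max_size = t
        self.bucket[s].remove(c)
        self.bucket[t].add(c)
        self.size[c] = t
        if not self.bucket[s]:
            # Size s became empty: unlink it (its successor is t).
            p, n = self.prev[s], self.next[s]
            if p is not None:
                self.next[p] = n
            self.prev[n] = p
            del self.bucket[s], self.prev[s], self.next[s]

    def largest(self):
        """Return some chunk of maximum size."""
        return next(iter(self.bucket[self.max_size]))
```

Because a size only ever changes from s to s + 1, the new location is either already the successor of s in the list of non-empty sizes or is linked in right after it, so no search is needed.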



4.1.5 Splitting around the median 

Recall that every log n insertions into L, the largest chunk needs to split into two chunks of roughly
equal size. This is done as follows. Let m be the median of the chunk, let p be its parent, and
let r be the root. Then p needs to be split into two around m, and r needs to be split into two
around p, creating a new chunk. The splits are done using the same splitting method which is
used during the insertion, so details are omitted. The total amount of time needed to perform
this splitting is O(log n), and this process is spread over the next log n insertions made into L, for
a total of O(1) time per insertion. Notice that it is possible that a splitting process of p due to
it having too many children, as described during the insertion process, is happening concurrently
with a splitting process of p due to a chunk splitting around m. This situation can be solved by
performing all operations twice as fast, completing the splitting process which began first, and only
then proceeding to the next splitting process. This is similar to the techniques used in Section 3.3,
where colliding tag-processes pass their time resources to other tag-processes. Such a case still costs
only O(1) time per insertion.



4.2 Combining Chunks with Monotonic List Labeling 

As mentioned above, the list L is partitioned into O(n/log n) chunks, and a new chunk is created every
log n insertions into L. In addition, a list of the roots of the chunks, ordered by L, is maintained via
the solution presented in Section 3 for the MLLP. Denote this ordered list of roots by L_r. The size
of L_r is O(n/log n), and every log n insertions into L, one insertion is made into L_r due to the largest
chunk splitting. This process of inserting a new root into L_r costs O(log n) time, and is spread
over the following log n insertions made into L, before another insertion is made into L_r. Recall
that the insertion into the monotonic list labeling structure can be done in parts without affecting
order queries, due to Theorem 3.3. Thus the total time per insertion is O(1) in the worst-case.



4.3 Answering Order Queries 

An Order(u,v) query is answered as follows. First it needs to be established whether u and v are
in the same chunk or not. This is done in O(1) time by checking if the root of the chunk of u is the
same as the root of the chunk of v. If the chunks are the same, then the query is answered directly
through the chunk. If not, then the query is answered by comparing the tags of the roots of the
chunks given by the monotonic list labeling structure. For simplicity's sake, consider the tag of each
node in L to be the concatenation of the tag of its chunk representative in L_r, followed by the tags
of its parent and itself within its chunk. This will simplify the explanations in Section 5.
Thus, the following has been proven.

Theorem 4.2. It is possible to solve the order maintenance problem where each operation costs
O(1) time in the worst-case.
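
The query procedure of Section 4.3 and the concatenated tag can be illustrated with a short sketch. The representation below is an assumption made for the example: each node carries three integer tags (that of its chunk root in L_r, that of its parent within the chunk, and its own), every tag fits in a fixed number of bits, and the names Tagged, precedes, and concatenated_tag are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    chunk_tag: int    # tag of the chunk root in the list labeling structure
    parent_tag: int   # tag of the parent, assigned by the chunk root
    own_tag: int      # tag of the leaf, assigned by its parent


def precedes(u, v):
    """Order(u, v): True if u precedes v in L."""
    if u.chunk_tag != v.chunk_tag:
        # Different chunks: compare the tags of the chunk roots.
        return u.chunk_tag < v.chunk_tag
    # Same chunk: answer inside the depth-2 tree.
    return (u.parent_tag, u.own_tag) < (v.parent_tag, v.own_tag)


def concatenated_tag(u, bits):
    """Single integer whose binary form is the three tags side by side.

    Assumes each individual tag is smaller than 2**bits.
    """
    return (u.chunk_tag << (2 * bits)) | (u.parent_tag << bits) | u.own_tag
```

Comparing two concatenated tags as integers gives the same answer as precedes, which is what allows a single key per node to be used later.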

5 Adding Predecessor Queries



In this section it will be shown how the POLP can be implemented within the bounds claimed.
The results are summarized by Theorem 5.1 at the end of this section.

5.1 y-fast-tries



The y-fast-trie [Wil83] is picked as the predecessor data structure of choice in order to achieve the
desired bounds, due to its simple presentation. It is possible to achieve the same bounds with other
structures (such as the van Emde Boas data structure [vEB77]). The y-fast-trie answers
predecessor queries over a set S of size m taken from a universe U in O(log log u) time, where u = |U|.
In addition, as will be shown, updates to S (insertions and deletions) can be done in O(log log u)
expected time. The space usage is O(m) words.

Since the details of the implementation of the y-fast-trie are of importance in the setting here,
they are described briefly next. The y-fast-trie is based on another structure called the x-fast-trie,
which is a trie of the binary representations of elements in S (so an edge to a left child corresponds to
0, while an edge to a right child corresponds to 1). Thus, the height of the x-fast-trie is log u, and
each node corresponds to a prefix of the binary representation of some element (possibly more than
one) in S. If a node only has a right child, then it maintains a pointer to the leaf with smallest
key in the subtrie of its right child. Likewise, if a node only has a left child, then it maintains a
pointer to the leaf with largest key in the subtrie of its left child. Finally, each node is maintained
in a dynamic hash table, with the key being the binary prefix corresponding to the node (together
with its length). Roughly speaking, a predecessor search on x ∈ U performs a binary search over
the prefix lengths of the binary representation of x, and takes O(log log u) worst-case time. An
insertion of x ∈ U is performed by inserting the new nodes corresponding to prefixes of the binary
representation of x into the hash table, and possibly updating pointers from internal nodes to some
leaves. This takes O(log u) worst-case expected time, where the expectation is due to the dynamic
hash table. The space usage of the x-fast-trie is O(m log u) words.
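
The x-fast-trie described above can be sketched compactly. This is a simplified illustration and not the paper's implementation: Python dicts stand in for the dynamic hash table, every trie node stores both the smallest and the largest leaf of its subtrie (a simplification of the single descendant pointer), keys are assumed to be integers of exactly w bits, predecessor(x) is taken to mean the largest stored key that is at most x, and the class and method names are hypothetical.

```python
class XFastTrie:
    def __init__(self, w):
        self.w = w
        # levels[l] maps a prefix of length l to [min leaf, max leaf] below it.
        self.levels = [dict() for _ in range(w + 1)]
        self.prev = {}   # doubly linked list over the leaves (stored keys)
        self.next = {}

    def _longest_prefix(self, x):
        """Binary search for the longest prefix of x present in the trie."""
        lo, hi = 0, self.w
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if (x >> (self.w - mid)) in self.levels[mid]:
                lo = mid
            else:
                hi = mid - 1
        return lo

    def predecessor(self, x):
        """Largest stored key <= x, or None if there is none."""
        if not self.levels[0]:
            return None
        l = self._longest_prefix(x)
        if l == self.w:
            return x
        min_leaf, max_leaf = self.levels[l][x >> (self.w - l)]
        next_bit = (x >> (self.w - l - 1)) & 1
        if next_bit == 1:
            return max_leaf          # whole subtrie lies to the left of x
        return self.prev[min_leaf]   # whole subtrie lies to the right of x

    def insert(self, x):
        if x in self.levels[self.w]:
            return
        p = self.predecessor(x)
        if p is None:
            s = self.levels[0][0][0] if self.levels[0] else None
        else:
            s = self.next[p]
        self.prev[x], self.next[x] = p, s
        if p is not None:
            self.next[p] = x
        if s is not None:
            self.prev[s] = x
        for l in range(self.w + 1):  # one hash table update per prefix
            node = self.levels[l].setdefault(x >> (self.w - l), [x, x])
            node[0] = min(node[0], x)
            node[1] = max(node[1], x)
```

The binary search in _longest_prefix touches O(log w) = O(log log u) hash table entries, while insert touches all w + 1 prefixes, in line with the bounds stated above.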

Typically, the y-fast-trie uses the x-fast-trie as a top structure together with standard bucketing
techniques. S is partitioned into O(m/log u) buckets, each bucket with O(log u) consecutive elements.
Each bucket is maintained in a balanced binary search tree, BBST for short (AVL trees, or
red-black trees). In addition, each bucket sends one representative to the x-fast-trie, which is now built
on only O(m/log u) elements, and so the space usage is now O(m). An insertion is performed by inserting
into the BBST of the appropriate bucket, and splitting the bucket if needed (causing an insertion
to the x-fast-trie). A bucket is only split after Θ(log u) insertions are made into that bucket, and
so the time for inserting into the y-fast-trie is O(log log u) amortized expected time. A query is
performed by first querying the x-fast-trie, and then searching in the BBST of the appropriate
bucket (with possibly 1 or 2 more buckets near it) in O(log log u) worst-case time.
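
The bucketing layer can be illustrated structurally as follows. This sketch makes several assumptions that are not from the paper and does not achieve the stated time bounds: sorted Python lists stand in for the BBSTs, a sorted list of bucket minima searched with bisect stands in for the x-fast-trie, the representative of a bucket is taken to be its smallest element, and the split threshold 2w is an arbitrary choice of a Θ(log u) bound. All names are hypothetical.

```python
import bisect


class YFastSketch:
    def __init__(self, w):
        self.cap = 2 * w     # split threshold, Theta(log u) for w = log u
        self.reps = []       # bucket minima (stand-in for the x-fast-trie)
        self.buckets = []    # buckets[i] is a sorted list (stand-in for a BBST)

    def _bucket_index(self, x):
        # Bucket whose representative is the predecessor of x (or the first).
        return max(bisect.bisect_right(self.reps, x) - 1, 0)

    def insert(self, x):
        if not self.buckets:
            self.reps.append(x)
            self.buckets.append([x])
            return
        i = self._bucket_index(x)
        b = self.buckets[i]
        j = bisect.bisect_left(b, x)
        if j < len(b) and b[j] == x:
            return
        b.insert(j, x)
        self.reps[i] = b[0]
        if len(b) > self.cap:
            # Split the bucket; the right half sends a new representative up.
            half = len(b) // 2
            self.buckets[i:i + 1] = [b[:half], b[half:]]
            self.reps[i:i + 1] = [b[0], b[half]]

    def predecessor(self, x):
        """Largest stored key <= x, or None if there is none."""
        if not self.buckets or x < self.reps[0]:
            return None
        b = self.buckets[self._bucket_index(x)]
        return b[bisect.bisect_right(b, x) - 1]
```

Only a bucket split changes the set of representatives, which is why the top structure is updated rarely compared to the buckets themselves.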



However, the technique from Lemma 4.1 can be used in order to make the insertion time
worst-case expected by splitting the largest bucket every log n insertions. This way, S is still
partitioned into O(m/log u) buckets, but each bucket has O(log^2 u) consecutive elements. Nevertheless,
a predecessor search within a bucket still costs O(log log u) time in the worst-case. An insertion
into the x-fast-trie is now done as follows. Each node in the x-fast-trie is given a timestamp of
when it was created. When the insertion process begins, the timestamp τ prior to the insertion
is saved, and any predecessor query that is performed during the insertion process will ignore any
data that has a timestamp after τ. Once the insertion phase is completed, the structure is informed
that it may ignore τ (or τ can be updated to the new timestamp after the insertion took place).
Notice that the pointers to smallest or largest elements in some subtries also need to maintain these
timestamps, and possibly another pointer to differentiate between the pointer prior to time τ and
the pointer after time τ. The O(log u) work needed to update the x-fast-trie is spread over the
following O(log u) insertions into any bucket, and finishes before another bucket splits. In addition,
the splitting of the buckets is also done during those following O(log u) insertions.
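
The timestamp mechanism can be pictured with a very small sketch. It is only an illustration of the visibility rule under assumptions of this example: a single integer clock, a dict mapping each entry to its creation time, and hypothetical names (StampedTable, begin_insert, end_insert, visible). The handling of the per-node pointers mentioned above is not modelled.

```python
class StampedTable:
    def __init__(self):
        self.clock = 0
        self.saved = None      # the timestamp saved when an insertion starts
        self.created = {}      # entry -> timestamp of its creation

    def begin_insert(self):
        """Save the current timestamp; later entries stay hidden for now."""
        self.saved = self.clock

    def add(self, entry):
        self.clock += 1
        self.created[entry] = self.clock

    def end_insert(self):
        """The insertion is complete; everything becomes visible."""
        self.saved = None

    def visible(self, entry):
        """True if a query issued now is allowed to see this entry."""
        stamp = self.created.get(entry)
        if stamp is None:
            return False
        return self.saved is None or stamp <= self.saved
```

Queries issued while an insertion is in progress therefore see exactly the state from before the insertion began, even though the new entries are added a few at a time.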

5.2 Scheduling Splits of Buckets and Chunks 

Let m = Σ_{i=1}^{k} |S_i|. One option for solving the POLP is to maintain each of the sets
S_1, ..., S_k ⊆ L in a y-fast-trie, with the key of each element being its tag from the
order-maintenance structure from Section 4. Notice that each key is contained within O(log n) bits,
as the universe size of the tags is bounded by n^c for some constant c. However, the problem with
this solution is that each new element inserted into L can cause a poly-logarithmic number of
elements to change their tags (as it changes the tag of O(1) chunks), which needs to be reflected by
changing the keys in the y-fast-trie, and can be rather costly.

The first observation that can help solve this problem is to notice that the keys which need
to be readjusted in the y-fast-trie are the keys of representatives of buckets, as the other elements
are maintained in a BBST, and so their ordering never changes regardless of their tags. The
second helpful observation is that while each insertion into L causes many tags to change in the
order-maintenance structure, only O(1) tags of representatives of chunks in the monotonic list
labeling structure are changed. Thus, the main idea is to guarantee that each chunk in the
order-maintenance structure will contain 1 (possibly 2 during a split) bucket representative from any of
the y-fast-tries of any of the sets. However, care needs to be taken to guarantee that the splitting
processes caused by buckets and chunks splitting do not interfere with each other. To this end, some
more modifications are needed, as is described next.



Unifying splitting processes: The first step in order to guarantee that each chunk has at most
1 bucket representative is to unify the splitting process over all the y-fast-tries of all of the sets. So
now, every log n insertions into any of the y-fast-trie structures of any set, the largest bucket from
all of the y-fast-tries is taken to be split. This guarantees, by Lemma 4.1, that only one y-fast-trie
will be in the midst of an insertion process into its x-fast-trie component, while the size of any bucket
is bounded by O(log^2 n). Furthermore, the total number of buckets in all the y-fast-trie structures
is O(m/log n), and so the total space used by all of the y-fast-tries is still linear.

Interfering splits: Each time a new bucket representative is created, if the chunk which contains
this representative already has a different representative within it, it will need to be split via a
splitting process denoted by P_b. On the one hand this seems reasonable, as such a process takes
place only once every log n insertions into y-fast-trie structures, so its work can be spread over those
insertions. However, it is possible that the order-maintenance structure is already in the midst of
a chunk split process, denoted by P_c, due to its own machinery. Alternatively, it is possible that
P_c wants to begin while P_b is currently executing.

This difficulty is solved as follows. The solution is shown for the first case (i.e. P_b begins while
P_c is in the midst of executing), as the second case simply reverses the roles of the processes.
Going back to the terminology of Section 3.3, each insertion into any of the y-fast-tries gives a time
resource to P_b, while each insertion into L gives a time resource to P_c. P_b uses its time resources to
help P_c finish, but now P_c needs to perform double the work per each time resource it receives. Let
x be the number of time resources passed from P_b to P_c. When P_c finally completes the current
split, P_b uses its next x time resources to do double the work of what it normally would do, allowing
it to catch up to where it would have been had the interference not taken place. Notice
that at most half the time resources given to P_c will be passed on to P_b. Thus, each time resource
is still used to perform O(1) work.



Final run-time tuning: Each insertion into either a y-fast-trie bucket or a chunk causes O(1)
representatives to change their tag. However, each such change of a tag will actually cost O(log n)
worst-case expected time, as the binary representation of the representative is changed, and this
needs to be reflected in the x-fast-trie portion of the y-fast-trie. To solve this, the definitions of a
chunk and bucket are slightly changed. Instead of creating a split every log n insertions, now a split
is created every log^2 n insertions. Following Lemma 4.1, the size of the largest chunk or bucket is
now bounded by O(log^3 n). The only additional change is that each chunk is implemented with a
tree of depth 3 instead of a tree of depth 2. The rest of the details remain the same, up to some
constants.

Now that a split occurs every log^2 n time resources, the scheduling is done as follows. Every
log n time resources, O(1) work is performed on the monotonic list labeling structure from Section 3.
This causes O(1) bucket representatives to change their tag, and so the O(log n) work that needs to be
performed in order to update the appropriate y-fast-tries with this tag change is spread over the
next log n time resources. Thus each time resource pays for O(1) work in expectation.



Running time: The following has been proven. 

Theorem 5.1. There exists a data structure for a dynamic ordered list L of size n and (disjoint)
dynamic subsets S_1, ..., S_k ⊆ L such that: (1) order queries are answered in O(1) worst-case time,
(2) inserting a node u after a given node v ∈ L takes O(1) worst-case expected time, (3) for any
1 ≤ i ≤ k, inserting an element from L \ (S_1 ∪ ... ∪ S_k) into S_i takes O(log log n) worst-case
expected time, and (4) for any 1 ≤ i ≤ k and u ∈ L, locating the predecessor of u in S_i takes
O(log log n) worst-case time.